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ABSTRACT 

We present a new method based on the N-point probability distribution (pdf) to 
study non-Gaussianity in cosmic microwave background (CMB) maps. Likelihood and 
Bayesian estimation are applied to a local non-linear perturbed model up to third 
order, characterized by a linear term which is described by a Gaussian N-pdf, and 
a second and third order terms which are proportional to the square and the cube 
of the linear one. We also explore a set of model selection techniques (the Akaike 
and the Bayesian Information Criteria, the minimum description length, the Bayesian 
Evidence and the Generalized Likelihood Ratio Test) and their application to decide 
whether a given data set is better described by the proposed local non-Gaussian model, 
rather than by the standard Gaussian temperature distribution. As an application, we 
consider the analysis of the WMAP 5-year data at a resolution of rj 2°. At this angular 
scale (the Sachs- Wolfe regime), the non-Gaussian description proposed in this work 
defaults (under certain conditions) to an approximative loca l form of the weak non- 
linear coupling inflationary model fe.g. iKomatsu et al.ll200l[ ) previously addressed in 
the literature. For this particular case, we obtain an estimation for the non-linear 
coupling parameter of —94 < fjjL < 154 at 95% CL. Equally, model selection criteria 
also indicate that the Gaussian hypothesis is favored against the particular local non- 
Gaussian model proposed in this work. This result is in agreement with previous 
findings obtained for equivalent non-Gaussian models and with different non-Gaussian 
estimators. However, our estimator based on the N-pdf is more efficient than previous 
estimators and, therefore, provides tighter constraints on the coupling parameter at 
degree angular resolution. 

Key words: cosmology: observations - cosmology: cosmic microwave background - 
methods: data analysis methods: statistical 



1 INTRODUCTION 

The Cosmic Microwave Background (CMB) fluctuations, a 
relic radiation originated around 400,000 years after the Big- 
Bang, is one of the most outstanding sources for understand- 
ing the evolution and energy /matter content of the Uni- 
verse. The large amount of high quality data provided by 
recent CMB experiments and other complementary astro- 
nomical observations, have provided with a consistent pic- 
ture for a flat Universe filled with cold dark matter (CDM) 
and dark energy in the form of a cosmological constant 
(A), plus the standard baryonic and electromagnetic com - 
ponents: the concordance model fe.g. iKomatsu et alj [2008h 
However, beyond the strength of the CMB measurements 
to put constraints on the cosmological parameters, like the 
ones already provided by the NASA Wilkinson Microwave 



Anisotropy Probe (WMAP, H inshaw et al. 1 120081 ) and the 
ones expected from the incoming ESA Planck satellite, the 
CMB is a unique tool to probe fundamental principles and 
assumptions of the so-called standard model. In particular, 
the application of sophisticated statistical analysis to CMB 
data might help us to understand whether the temperature 
fluctuations of the primordial radiation are compatible with 
the fundamental isotropic and Gaussian predictions from the 
inflationary phase. The basic inflationary scenario relates 
the CMB fluctuations, as well as the large-scale structure 
of the Universe, to the Gaussian quantum energy density 
pertur bations present durin g the early Universe (see for in- 
stance iLiddle^&^XxtflllQO^). The present homogeneity and 
isotropy of the Universe is compatible with an inflation- 
ary era in the early universe and this idea is the only way, 
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nowadays, to explain efficiently current observations rang- 
ing from galaxies to the CMB. In particular, this fundamen- 
tal hypothesis predicts that the CMB fluctuations follow an 
isotropic and Gaussian random. In fact, the estimation of the 
cosmological parameters defining the concordance model is 
done by assuming these statistical properties. 

The quality of current CMB data (in particular the 
ones provided by the WMAP satellite) has allowed for a 
systematic probe of the statistical properties of the relic 
radiation. Indeed, the interest of the scientific community 
in this field has experimented a significant growth, since 
the analysis of the WMAP data has reported several hints 
for departure from isotropy and Gaussianity of the CMB 
temperatu re dis t ributi on. Some of these works are the 
following. IParkl (|2004l ) detec ted a Gauss i anity deviation 
with a genus-based statistic; I Vielva et alJ (|2004l ) found a 
significant non-Gaussian signature on the 1-year WMAP 
data on the kurtosis of the Spherical Mexican Hat (SMHW) 
wavelet coefficients at scales of around 10 degrees, pointing 
out a very large cold spot (CS) on the southern hemisphere 
as a pos sible source for t his non-Gaussianity. A posterior 
analysis ijCruz et alJ 2005) confirmed the anomalous nature 
of the CS by performing an area-based statistical analysis of 
the wavelet coefficients. This detection, as well as new ones, 
were confirmed by analyzing WMAP with various wavelet 
bases and several statistical estimators dMukheriee fc Wang 



20041: ICavon et all l2005l: iMcEwen et all 120051: ICruz et al 



20061; iMartfnez-Gonzalez et all 120061; ICruz et all l2007al; 



McEwen et alJ l200rj ; iPietrobon et all |200S| ; IWiaux et all 
20081 ). Isotropy deviations were also reported in different 



the CMB dCoDi et al. 


2004; de Oliveira-Costa et all 2004 


Katz & Weeks! 2004; 


Schwarz et all |2004|; Bielewicz et all 


2005; Land & 


Magueiio 2005b; Abramo et al. 2006; 


Freeman et all 12009; lLand & Magueiiol 120071) ; north-south 



asymm etr ies of the C MB fl uct uations (jEriksen et al 



2004al lbl: 
20051: 



Eriksen et al 



Hansen et all 2004al l 3; Donoghue fc D onoghue 



20051; lLand fc Magueiiol 



2005a 



Bernui et all 120061 120071 ; lEriksen et all 120071 ; iRath et ail 
20071; iGordonl 120071); an anomalous variance value 
( Monteserm et al 20081); unexp e cted correlation 



the CMB phases 



IChiang fc Naselskvl 



;UUa); unexpe cted correlation among 
Chiang et all 120031 ; IColes et all |2004| ; 



2006) ; unbalanced dist ribution of the 



temperature extrema ( Toieiro et all 120061) ; and anoma- 
lous alignment of CM B structures ( Wiaux et all 120061 ; 
IVielva et al.ll2006l . l2007n . 

All the previous analyses can be considered as blind 
approaches, since null tests were performed to probe the 
CMB compatibility with the isotropic and Gaussian hy- 
potheses. Complementary to these works, the reader can 
also find in the literature several studies where targeted 
departures from Gaussianity are explored, based on non- 
standard physical models. In particular, several analy- 
ses have studied the WMAP data compati bility with 
anisot ropic universes: the Bianch i VHh model ( Jaff e et ail 
l2006allblkl ; iBridges et all l2007al . |200S| ). Recently, some 
works have prop o sed n on-rot ational invariant model s like 
iBohmer fc Motal (120081) and lAckerman et"aT] l|2007fl (ex- 
prored by Groeneboom fc Eriksenl " 20081 . in the context 
of WMAP data). Although these non-rotational invariant 
models are promising and provide us with anisotropic tem- 
plates that could help to fix some anomalies in WMAP 



data, more work is still needed to connect these anisotropic 
patterns of the CMB fluctuations to a satisfactory physi- 
cal model describing the evolution of the anisotropic field 
jHimmetoglu et al.ll2008aH bh . 

In addition to the previous analyses, several works have 
studied different hypothese s to explore the a nomalous na- 
ture of the CS; for instance ICruz et all (|2008h explored the 
CS compatibility with different non-standard models, point- 
ing out that an explanation in ter ms of a cosmic texture 
(as proposed by I Cruz et all l2007bl ) is much more favored 
than other alternatives already discussed in the literature, 
like a very large void in the large scale structure of the 
Universe or contamination, in the form of the Sunyaev- 
Zeldovich emission, due to a large and nearby galaxy clus- 
ter. However, the study of non-standard inflationary mod- 
els is the problem that has attracted a larger attention. 
For instance, and for the WMAP case, the non-linear cou- 
pling parameter fNL that describes the non-linear evolution 
of the inflationary potential (see e.g. iBartolo et al. 1 12004 
and references therein) has been co nstrained by several 
groups: using the angu l ar bispectrum feomatsu et al. 2003: 
Creminelli et all 120061 ; ISpergel et all 120071 ; [ Komatsu et al 
2003) ; applying the Minkowski functionals ( Komats u et al 



Spcrgel et all 



20081 ; iKomatsu et al 



20071; iGott et all 120071 : iHikage et al 
20081) ; and using different statistics 



based on wavelets dMukheriee fc Wa ng 2004; ICabella et all 
120051 ; ICurto et alll2008l ). Besides a claim for f NL > wit h 
a probability greater than 95% (lYadav fc Wandeltl 120081 ) . 
there is a general consensus on the WMAP compatibility 
with the predictions made by the standard inflationary sce- 
nario at least at 95 % confidence level. The current best limits 
jCurto et al.ll200Sl) are: -8 < f NL < 111 at 95% CL. 

The present paper is related to non-blmd or targeted 
probes for Gaussianity deviations, and, more specifically, it 
addresses two common topics arose in these kind of studies: 
the definition of an optimal estimator for the non-standard 
model (and, therefore, the optimal estimation of the param- 
eters that define such a model) and the complementary is- 
sue of model selection. The former aspect is, usually, quite 
complex, since, for many physical models and realistic obser- 
vational limitations (e.g. incomplete sky coverage) there is 
not a trivial solution and, therefore, non-optimal estimators 
are usually proposed (which are posteriorly characterized by 
simulations). The latter issue on model selection is equally 
complex, since, generally, relies on heuristic principles. Some 
authors adopt a fully Bayesian view, whereas others adopt 
asymptotic measurements for the distance between two hy- 
pothesis, and others prefer to rely on the information that 
can be obtained from the likelihood itself. In this work we 
propose to study a non-Gaussian model for the CMB that 
is a local non-linear expansion of the temperature fluctua- 
tions. For this model — that, at large scales, is considered 
by some authors as an approximation to the non-linear cou- 
pling parameter fNL — we are able to build the exact like- 
lihood on pixel space. To work in pixel space allows one to 
include easily non-ideal observational conditions, like incom- 
plete sky coverage and anisotropic noise. We show how, from 
this likelihood, it is straightforward to obtain an analytical 
expression for the optimal estimator for the non-Gaussian 
term. In addition, we also explore different model selectio n 
criteria, like the Akaike informatio n criterion llAkaikelll973l ). 
the Bayesian information criterion (Schwarz 19781 ). the min- 
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imum description length l|Rissanenll200ll ), the Bayesian evi- 
dence, and the generalized likelihood ratio test. 

The paper is organized as follows. In Section [2] we de- 
scribe the physical model based on the local expansion of 
the CMB fluctuations and derive the full posterior proba- 
bility. In Section [3] we address the issue of the parameter 
estimation (|3.1fl and model selection (|3.2[) . The methodol- 
ogy is explored on WMAP-like simulations in Section [4] and 
it is applied to WMAP 5-year data in Section \5\ Finally, 
conclusions are given in Section [6] 



2 THE NON-GAUSSIAN MODEL 

Current observations indicate that CMB temperature fluc- 
tuations can be well described by random Gaussian fluctu- 
ations, as predicted by the standard inflationary scenario. 
However, as it was discussed in the Introduction, these ob- 
servations still allow for a small departure from the Gaussian 
distribution, which could reflect the role played by physical 
processes described by non-standard models for the struc- 
ture formation. 

In this paper, we will focus on a parametric non- 
Gaussian model, that accounts for a small and local (i.e. 
point-to-point) perturbation of the CMB temperature fluc- 
tuations, around its intrinsic Gaussian distribution: 

AT, = (ATi) G + a [(ATj)g - ((AT,&>] + b (AT*)* . (1) 

(ATj) G (the linear term) is the Gaussian part, whose N- 
point probability density function (N-pdf) can be easily de- 
scribed in terms of the standard inflationary model. The 
second and third terms on the right-hand side are the 
quadratic and cubic perturbation terms, respectively. Their 
corresponding contribution to the observed CMB fluctua- 
tions (AT;) is governed by the a and b parameters. Subindex 
i refers to the direction corresponding to a certain pixeliza- 
tion on the sphere, and the operator (•) averages over all 
the pixels defining the sky coverage. Notice that the previ- 
ous expression does not include instrumental noise. We have 
adopted this simplification, since we aim to focus the work 
on large-scale CMB data sets (> 1°), where (as it is the 
case for WMAP), the contribution from noise is typically 
negligible. 

Let us point out that the particular situation for 
b = defaults into a well known c ase that has been al- 
ready addressed in the literature (e.g . iKomatsu et al.|[200ll ; 
ICavon et all 120031 ; ICurto et all 120071 ). It describes a local 
approximation to the weak non-linear coupling inflati on- 
ary model (e..g K omatsu et al.ll200ll ; iLiguori et al.ll2003T ) at 
scales larger than the horizon scale at the recombination 
time (i.e. above the degree scale). In this context, the a fac- 
tor is usally related to the non-linear coupling parameter 
£nl by: 

3fNL 



To 



(2) 



where To = 2, 725 mK is th e CMB temperature, and we 
follow the sign convention in iLiddle fc Lvthl i|200oh for the 
relation between the temperature fluctuations and the gravi- 
tational potential at the Sachs- Wolfe regime. Let us remark 
that, of course, this model do not pretend to incorporate 
all the gravitational (like lensing) and non-gravitational ef- 
fects, due to the evolution of the initial quadratic potential 



model. Indeed, the reason to select the specific model given 
by equation [T] is twofold. One the one hand, it is an useful 
parametrization for describing a small departure from Gaus- 
sianity, that allows us to present a new methodology. On the 
other hand, it is a model previously addressed by other au- 
thors (using other estimators), and, therefore, it is easier to 
make a straightforward comparison among the results. 

For simplicity, let us transform the Gaussian part 
(AT;) G into a zero mean and unity variance random vari- 
able 4>i, hence, equation [T] can be rewritten as: 



where: 



-AT, 



x t — <Pi + e ( 



,,-(AT )c 



-1 



ba 2 



(3) 



(4) 



and a 2 = ((AT;) G ) is the rms of the CMB fluctuations. Let 
us remark that, since the proposed non-Gaussian model is 
a perturbation of the standard Gaussian one, the non-linear 
parameters have to satisfy: 



N « l, 



< l. 



(5) 



It is straightforward to show that the normalized Gaussian 
field d> satisfies: 



i) = (0?) = 0, (0?) = 1, 



(6) 



where represents the normalized correlation between pix- 
els i and j. The N-pdf of the <fi — {^>i,4>2, 4>n} random 
field (where N refers to the number of pixels on the sphere 
that are observed) is given by a multivariate Gaussian: 



p(d>) = 



(2n) N / 2 (det£)V2 



(7) 



where £ denotes the correlation matrix and operator • de- 
notes standard matrix/vector transpose. 

Our goal is to compute the N-pdf associated to the non- 
Gaussian x — {xi, X2, xn} field, as a function of the non- 
linear coupling parameters based on the full N-pdf for the 
underlaying Gaussian signal 0. Hence, let us make first the 
inversion of equation [3] to find the expression of 4>i as a 
function of Xi\ 

4>i=Xi- e(x 2 - I) + e 2 [(2 - a)x\ - 2 Xi ] + 0(3). (8) 

Obviously, since previous equation is a local transfor- 
mation, the Jacobian matrix is diagonal and, therefore, the 
Jacobian (Z) is given by: 



det 



dxj 



n 



dx i 



(9) 



It is more convenient to work with the log- Jacobian (log Z), 
which, taking into account equation[8]and expanding the log 
function up to second order, is given by: 



logZ 



^ l0g \dx. 



-2iV+ (4- 3a)J2 x * 



(10) 



+ 0(3). 



Now, taking into account equation [3] and recalling that (b is 
a Gaussian field, it is straightforward to prove that the data 
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x satisfy: JZ i Xi = and — 1 + 0(2)- Therefore, 

the log-Jacobian is given by: 

logZ = Ne 2 (2- 3a) + 0(3), (11) 

Finally, it is easy to calculate p(x\e), the N-point pdf of x 
given the e parameter: 



p(a|e) = p(0 = 4>{x))Z = P {x\0)e l(xlc) 
where the probability p(x\0) is given by: 



P(sc|0) = 



(27r) Ar / 2 (det (|)) 1 /2 



e 2 



(12) 



(13) 



i.e., it is the N-point Gaussian pdf (i.e. p(x|0) = p((f> = x)), 
and the log-likelihood l(x\e) reads as: 



l(x\e) =logZ- i 1 t j>*-x$ V] 
It is easy to show that Z(cc|e) is given by: 

Z(x|e) = N[eR~ e 2 Q + 0(3)}, 
where the functions R and Q axe given by: 

-2 ¥l i 



^k 1 



and 



Q = -2 + J + otS, 
being I the unity vector of dimension N and 



(14) 
(15) 
(16) 
(17) 



J - 



and 



N 



2xC l [x(x*-l)]* + ±[x a 



r 1 



(18) 
(19) 



Let us remark that R and Q could be seen as a kind of 
generalized third-order and fourth-order moments, respec- 
tively (or, equivalently, in terms of the spherical harmonic 
coefficients, to the bispectrum and the trispectrum) . As an 
example, for the particular case of uncorrelated data (i.e., 
£ = S), it is straightforward to show that R = kg and 
Q = -5/2 + (3a - 2)k 2 + (5/2 - a)fc 4 , with k n = i f\ x ?. 

The N-pdf p(x\e) given by equation 1121 contains all the 
required information, on the one hand, to estimate the non- 
linear coupling parameter e and, on the other hand, to per- 
form a model selection (Gaussian vs. non-Gaussian). These 
two aspects will be studied in next Section. 

Notice that the parameter a (or, equivalently, b) can- 
not be estimated in this framework: it just appears as an 
arbitrary constant in the definition of Q (equation I17[) : it 
just controls the relevance of S in Q. If one were inter- 
ested in obtaining a posterior probability of the data x 
given both non- linear parameters (e and a), then it would 
be necessary to expand the local non-Gaussian model be- 
yond 0(3). However, we would always find an expression in 
which the non-linear parameter controlling the highest order 
in the expansion, acts as an arbitrary constant. For that rea- 
son, we keep terms up to 0(3). There is a discussion within 



the field (e.g . lOkamoto fc Hul[2002l: Ikogo fc Komatsull2006l : 



iBabichl 120051 : ICreminelli et al.ll2007l ) about to what extend 
the cubic term in the description of the weak non-linear 
coupling inflationary model (for which, we recall, the local 



non-Gaussian model in equation \T\ can be seen as an ap- 
proximation at large scales) is really negligible or not with 
respect to the quadratic contribution. For the former sce- 
nario (negligibility of the cubic term), the particular value 
of a becomes an irrelevant issue since, naturally, one will 
have that S <C — 2 + J (of course, a should always satisfy 
the condition given in equation [5} . However, for the latter 
case different results could be obtained, depending on the 
specific value for a. In the following, we will discuss pa- 
rameter estimation and model selection assuming a = in 
equation 1171 However, we will explore (analyzing WMAP 
data and simulations) whether the condition S <SC — 2 + J is 
naturally satisfied or not. 



3 STATISTICAL ANALYSIS 

In this Section we aim to address two aspects very much 
linked one to the other: the estimation of the parameter 
defining the local non-Gaussian model (e, Section 1 3 . 1 P and 
the computation of some heuristic rules to decide whether a 
given data set is better described by the local non-Gaussian 
model rather than by the standard Gaussian one (Sec- 
tionEHi 



3.1 Parameter estimation 

The description of the full N-pdf for the non-Gaussian model 
proposed in Section [2] allows one to obtain an optimal esti- 
mation of the non-linear coupling parameter e. 

Let us recall that an optimal estimation of the e pa- 
rameter is possible, since it would be derived from the full 
pdf for the non-Gaussian model, p(x\e). In other words, we 
could obtain an unbiased and minimum variance estimator. 
This is possible, precisely, for the specific selection of the 
local non-Gaussian model in equation [3] for other physical 
non- Gaussian models it is not always trivial to obtain a full 
description of the posterior probability of the data given the 
parameters (at least, under realistic observational conditions 
like incomplete sky coverage) and shortcuts have to be taken 
by defining pseudo-optimal estimators that are, afterwards, 
validated with simulations. 



3.1.1 Parameter estimation from the log-likelihood 

We shall define the optimal estimator for the non-linear pa- 
rameter as the value, e, that maximizes the probability of x 
given e. From equation ll2l it is obvious that maximizing this 
probability is equivalent to maximize Z(x|e) in equation 1151 
By derivation one obtains: 



R_ 
2Q' 



(20) 



We can also estimate the error associated to e from the 



Fisher matrix Fi 



= 2NQ: 



F. 



-1/2 



(2NQ)- 1/2 . 



(21) 



Notice that the error on the estimation of the parameter e 
is constant, up to the order considered in equation [3] 
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3.1.2 Bayesian parameter estimation 

Within the Bayesian framework we can include any a priori 
information that we might have in relation to the e parame- 
ter. In particular, following Bayes' theorem, the probability 
of e given the data x read as: 



p(e\x) (xp(x\e)p(e), 



(22) 



where p(e) is the prior probability function for the parame- 
ter e. Of course, for the case of the local non-Gaussian model 
proposed in this work, there is not a clear physical motiva- 
tion to choose a particular prior. 

Let us however explore, as an exercise, two simple sce- 
narios, which could be useful for more general purposes. We 
consider first a uniform prior given by: 



P(e) 



— if e G [e m) e M ] 



(23) 







otherwise 



where, obviously, the range allowed to e is such that e<l 
for any e £ [e m ,eM]- For this particular case, it is trivial to 
show that the Bayesian estimation for the non-linear cou- 
pling parameter (e) is equivalent to the one obtained via the 
maximum- likelihood estimation (i.e, e=e) if e 6 [e m ,eM]- 

The second case we want to address corresponds to a 
Gaussian prior p(e), described by a most probable value e* 
and a dispersion <r* : 



271-er* 



(24) 



By deriving the posterior probability, it is trivial to obtain 
the Bayesian estimation for the non-linear coupling param- 
eter (e): 



Na J R + e* 
2Na, 2 Q + l' 



(25) 



For the particular case of <r* — > 0, i.e. a very strong prior 
for e, peaked around e«, one trivially obtain e = e*, that 
is, the prior dominates Bayesian estimation, leading to a 
most probable value for e equal to the maximum value for 
the prior. Also trivially one finds that, for a non-informative 
scenario (i.e., <r» — > oo), e = e = R/2Q. 



3.2 Model selection 

In this subsection we aim to calculate under which condi- 
tions (according to different model selection criteria) a given 
observation x is better described by a local non-Gaussian 
model as the one described by equation [3] with |e| > 
(hereinafter Hi) rather than by a Gaussian random field 
described just in term of the N-point correlation function £ 
(hereinafter Ho). 

Some of the model selection approaches inves- 
tigated in this paper have been previously applied 
to differ ent astronomical/cosmological p roblems For in- 
stanc e, ([Szvdlowsk i fc Godlowskl 20061 : ISzvdlowski et all 

120061 : iBorowiec et al. I l200fj| ) applied the Akaike and the 
Bayesian information criteria (AIC and BIC, recpectively) 
to study whether astronomical data sets favored simplest 
models for the accelerating universe against more comp lex 
ones. This issue was also addressed bv lDavis" et all (|20071) by 



analyzing the ESSENCE supernova survey data, and appliy- 
ing Bayesian evidence (BE) in addition to the AIC and the 
BIC approaches, to study. These three model selection crite- 
ria (AIC, BIC and BE) were also applied to study the impact 
of no n-standard physical mo d els on t he Fr iedmann equa- 
tions (jSzvdlowski et al J I2008T ) . iLiddld <|2004l ) used the AIC 
and the BIC techniques to study, on the one hand, whether 
WMAP 1-year data preferred a spatially flat cosmology ver- 
sus a closed one, and, on the other hand the significance 
of the running spectral inde x detected on this WMAP data 
release. In a posterior work (jLiddlc 200^), BE was added to 
the AIC and the BIC approaches to study the suitability of 
different cosmological models to the WMAP 3-year data. 

Bayesian evidence (BE) is, probably, the model selec- 
tion criterion that has attracted a greater interest from 
cosmologists during the past years. In addition to the 
works mentioned above (where it was compared with 
other model selection criteria), it has been also applied 
to several problems where competing cosmological mod- 
els were explored. Some of the se applications are the fol- 
lowing: iMukheriee et al] l|2006h followed a BE approach 
to study cosmological models with different matter power 
spectr a and dark energy evolution models; iLiddle et al.l 
(|2006t) studied d i fferent dark energy evolving scenar- 



Bridges et ail (|2006l . l2007bl ) performed a model selec- 



tion on the matte r power spectrum from the BE analy- 
sis of the WMAP; iBridges et ail i|2007al , 120081 ) followed a 
similar approach for analyzing the W MAP compatibility 
with anisotropic Bianchi Vllh models; ICruz et all (|2007bl. 
2008) used it to decide whether the WMAP cold spot 
was compatible or not with prediction s from non-standard 
model s like the cosmological defects; IMukheriee fc Liddld 
(2008) studied the Planck ability to discriminate between 
several r e-ionization models; the BE criterion w as also 
applied l|Carvalho et al.l |200SI : iFeroz et all l2008bl ) to the 
problem of compa ct source de t ection on microwave data; 
and more recently, IFeroz et al.l |2008a) investigated differ- 
ent properties of the constrained minimal supersymmetric 
model (mSUGRA), using WMAP data. 



3.2.1 The Akaike information criterion (AIC) 

The Akaike information criterion (AIC, Akaike 1973]) pro- 
vides with a selection index to decide among competing hy- 
pothesis, being the model associated to the lowest index the 
most favored one. The Akaike index corresponding to a given 
model or hypotheses Hi defined by p parameters and with 
a maximum value for the log-likelihood of I is given by: 



AIC(Hi) = 2(p-l 



(26) 



From equation 1151 one can find that AlC(-fZi) = 
2(l- N^j, whereas, trivially, AlC(flo) = 0. Therefore, 
according to the AIC, the decision rule reads as: 



AIC 



H if -q < 



# 1 if £ > 4 



(27) 



Q JV 
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3.2.2 Bayesian information criterium (BIC) 

This a symptotic bayesian criterion introduced by ISchwarJ 
(j 19T8h is prior independent. It is based on the BIC function, 
that provides a measurement of the goodness-of-fit of the 
model to the data, taking into account the number of pa- 
rameters defining the model as well as the amount of data 
(N): 



BlC(Hi) = (-21+ pin N^j , 



(28) 



where p is the number of parameters defining the data and I 
is the maximum value for the log-likelihood. As for AIC, BIC 
provides a ranging index for competing hypothesis, where 
the one with the lower BIC value is the most favored one. 
From equation 1151 one can find that (for our specific prob- 
lem) BlC(-ffi) = -N§t + In (iV) and BIC(# ) = 0. There- 
fore, according to the BlC, the decision rule reads as: 



BIC 



Ho if %■ < jjlnN 



(29) 



3.2.3 Minimum description length (MDL) 

MDL (e.g. Ri.-.-anoii 2001 i is an inference approach mostly 
developed during the 80s and 90s, based on the key idea 
that the more regular a given data set is, the higher is the 
compression degree to which we can code the data, and, 
therefore, the more we can learn on the properties of the 
data. Among many statistical applications, MDL is used to 
select between competing models describing the data, select- 
ing the one allowing for a higher compression degree (which 
can be seen as an alternative formulation of the Occam's 
Razor) . 

For our particular case described by equation [3] the 
MDL measurement of compression is given by: 



MDL( e ) = -f+ihW |J ] -h, 



de[detF s ] 1/2 , (30) 



where Fi is the Fisher matrix of e. Therefore, taking 
into account equations [12] and 1151 one can easily compute 
MDL(i/i) and MDL(iio) providing a decision rule that 
reads as: 



MDL : I 



Ho 



if SL < ± In 

11 Q N 111 



Hi if 



R- 

Q 



(31) 



where f2 is the interval where e is defined. 



where v is an arbitrary value indicating the strength in 
choosing Hi instead of Ho- Therefore, according to the 
GLRT, the decision rule reads as: 



GLRT 



Ho if \ < ±u 



Hi if f > > 



(33) 



Notice that the case v = 1 provides the same decision rule 
as the AIC (equation I27|) . and that v = InviV corresponds 
to the BIC case (equation 129( 1 . 

3.2.5 Bayesian evidence (BE) 

BE is defined as the average likelihood of the model Hi in 
the prior p(t): 

E Hi (x) = J dep(e,Hi)p(x\e,Hi), (34) 

where p(x\e, Hi) is given by equation ll2l Model selection in 
terms of the BE grounds on the Bayes' factor, Bio: 

E Hl jx) 



Bio(x) 



(35) 



E Ho (x) 

BE framework provides a rule to quantify how strong the 
decision is. In the literatu re it is commonly accepted the Jef- 
freys' scale ([Jeffreys 196l|), that provides a recipe in terms of 
the logarithmic Bayes' factor. Roughly speaking, it is com- 
monly said that the evidence for Hi against Ho is not sig- 
nificant if ^ InBio(x) < 1, mild if 1 ^ lni?io(a;) ^ 3 and 
strong if \xiB\o(x) > 3. 

As it was already discussed, the alternative hypothesis 
Hi representing the local non-Gaussian model (equation [3j 
does not offer any physical motivation for a particular prior. 
Even thus, we study here the two particular cases already 
mentioned in subsection 13.1.21 the Gaussian (equation I24J1 
and the uniform (equation I23|l priors. 

For the former, it is straightforward to prove that: 

flio= . p e'^ ^tH^gy, (36) 
whereas for the latter, one can obtain: 



Bin — 



2 t M 



erf 



+ erf 



V2o-i 



(37) 

where e is the maximum-likelihood estimation for the non- 
linear parameter e (equation I20[) and oi is the error on this 
estimation (equation I21|l . Finally, notice that for the par- 
ticular cases of <r* 3> erg or e M — e m 3> erg in equations 1551 
and 1371 respectively (i.e., a broad prior), one obtains: 



3.2.4 Generalized likelihood ratio test (GLRT) 

Generalized likelihood ratio test (GLRT) is one of the most 
common approaches in model selection and its particular 
application to solve astronomical/cosmological problems has 
been very extensive. 

The criterion established by the GLRT to accept the al- 
ternative hypothesis Hi against Ho is given by p(x\e, Hi) > 
e l/ p(x\0, Ho) or, equivalently by: 

l = l(e)=N^>v, (32) 



where 7 

(fM — Cm) / 



&io ~ — e e , 

7 

<r* (for the Gaussian 
2tv (for the uniform prior). 



(38) 



prior) or 7 



4 APPLICATION TO WMAP SIMULATIONS 

In this Section we aim to explore the performance of 
the parameter estimators and the model selection crite- 
ria described in the previous Section. We apply them to 
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Figure 1. Mask at NSIDE=32 HEALPix resolution used in this 
work. It corresponds to the WMAP KQ75 mask, although the 
point source masking has not been considered, since the point 
like-emission due to extragalactic sources is negligible at the con- 
sidered resolution. At this pixel resolution, the mask keeps around 
69% of the sky. 



CMB simu lations of the WM AP 5-year data at NSIDE=32 
HEALPix (iGorski et al.ll2005h resolution (-2°). 

The procedure to generate a CMB Gaussian simulation 
— (AT) G in equation [l] — is as follows. First, using the Ce 
obtained with the cosmological parameters provided by the 
best- fit to WMAP data alone (Table 6 in iHinshaw et alj 
2008), we simulate WMAP observations (taking into ac- 
count the corresponding beam window functions) for the 
Ql, Q2, VI, V2, Wl, W2, W3, W4 difference assemblies 
at NSIDE=512 HEALPix resolution. We obtain a single co- 
added CMB map through a noise-weighted linear combi- 
nation of the eight maps (from Ql to W4). Weights are 
proportional to the inverse mean noise variance. They are 
independent on the position (i.e., they are uniform across 
the sky for a given difference assembly) and they are nor- 
malized to unity. Notice that we do not add a random noise 
realization to each map, since we have checked that noise 
plays a negligible role at the angular resolution in which we 
are interested in (~ 2°). However, we perform the linear 
combination of the difference assembly maps following the 
procedure described above, since it will be the same pro- 
cess that we will follow with the WMAP dataQ Afterwards, 
the co-added map at NSIDE=512 is degraded down to the 
final resolution of NSIDE=32. Finally, a mask representing 
a sky coverage like the one allowed by the WMAP KQ75 
mask dGold et alJl200Sl ) is adopted. At NSIDE=32 the mask 
keeps around 69% of the sky (notice that we do not consider 
the masking due to point sources, since at this resolution 
the contribution from individual extragalactic point sources 
is negligible, see figure [TJ. Let us remark that observational 
constraints like incomplete sky coverage can be easily taken 
into account by the local non-Gaussian model proposed in 
this work, since it is naturally defined in pixel space. 

We have used 500,000 simulations of (AT) G , generated 
as described above, to estimate the correlation matrix £ ac- 
counting for the Gaussian CMB cross-correlations. We have 



1 Co-added WMAP 5-year data is made in this way to produce 
a final map with a noise level smaller than, for instance, the one 
that could be achieved just by averaging the 8 difference assembly 
maps, assuring better a negligible noise contribution to the final 
map at resolution of 2°. 



computed this large number of simulations to assure an ac- 
curate description of the CMB Gaussian temperature fluc- 
tuations. Additional 1,000 simulations were also generated 
to carry out a statistical analysis on the performance of the 
different parameters estimators and model selection criteria. 

Each of these 1,000 (AT) G simulations are transformed 
into x (following equations [1] and [4} to study the response 
of the statistical tools as a function of the non-linear e pa- 
rameter defining the local non-Gaussian model proposed in 
equation [3] 

4.1 Parameter estimation 

Let us first consider the estimation made via maximum- 
likelihood estimation (Subsection 13.1 . 1 p . As it was men- 
tioned above, we have generated 1,000 non-Gaussian 
WMAP-like observations according to equation|3]for a range 
of values of e, in particular, we have considered e £ [0, 0.035], 
or equivalently, in terms of the most common coupling fNL 
parameter (equation [2}, we explore fNL £ [0,500]. We only 
explore positive values of the non-linear parameter, since the 
response of the proposed methodology does not depend on 
the sign of e. 

In figure [2] we present the e distributions obtained from 
the analysis of local non-Gaussian simulations. From left to 
right and from top to bottom, the panels show the cases: 
e = 0, 0.001, 0.002, 0.003, 0.004, 0.005, 0.010, 0.015, 0.020, 
0.025, 0.030 and 0.035. Notice that, as expected, the pa- 
rameter estimation is unbiased for reasonable values of the e 
parameter: all the distributions are peaked around the value 
of e used to generate the simulations. Only for e > 0.025 (or, 
equivalently, fNL ~ 350) a bias starts to appear. This effect 
comes from the fact that these values of e are not small 
enough to assure a local non-Gaussian model as the one de- 
scribed by equation |3j i.e., a non-Gaussian model that is a 
local perturbation of the underlaying CMB Gaussian signal. 
Also as expected (see equation I2ip . the width of these dis- 
tributions does not depend on the particular value of e, if, 
once more, e is small enough to assure a proper expansion for 
the local non-Gaussian model. In this regime, we obtain, on 
average, oi w 0.004, or, equivalently, <r f - w 60. Again, for 
e > 0.025 the width of the distributions starts to be slightly 
smaller, indicating an inadequate value of the non-linear pa- 
rameter. These two effects (bias of the maximum-likelihood 
parameter estimation and dependence on the error on the 
parameter estimation) provide a natural range (at least for 
a pixel resolution of « 2°) where the non- linear parameter 
is allowed to take values: e £ [—0.025,0.025]. 

This allowed range for e can be seen as a natural prior 
p (e), that could be used for performing a parameter estima- 
tion within a Bayesian framework. Obviously (as discussed 
in Subsection 13 . 1 . 2 [I . the Bayesian estimation made with this 
uniform prior produces the same estimations for e already 
reported from the maximum-likelihood. 

Finally, we have also investigated whether the value of 
S in equation [17] is negligible as compared to —2 + J, which, 
as discussed in Section^ would lead to a situation where the 
choice of a particular value for the non-linear parameter a 
becomes an irrelevant problem. We found that, actually, this 
is not the case: on average, \S/ (—2 + J)| « 0.7. This implies 
that, first, different values of a could provide different re- 
sults, not only to the ones presented in this work, but also for 
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Figure 2. Distributions for the maximum-likelihood estimations of the non-linear parameter e obtained by analyzing 1,000 simulations, 
according to the local non-Gaussian model given in equation [3] Several values of e, or, equivalently of the coupling fp^L parameters (both 
numbers are written in each panel) are explored. Vertical dashed lines indicate the value of the e used to generate each set of local 
non-Gaussian simulations. 
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other works in the literature (where, we recall, it is assumed 
q = 0). In particular, it is trivial to show that, values of 
a < 0.15 would affect the determination of the coupling pa- 
rameter e in ~ 10%. Second, as i t has been discussed by so me 
authors jOkamoto fc Hull2002l ; Ikogo fc Komatsul l200rj) . it 
would indicate that, for the weak non-linear coupling infla- 
tionary model, the role played by cubic terms could be non 
negligible as compared to quadratic contributions, since, to 
some extent, they will contribute to the full trispectrum (as 
one can notice from equation 1171 where a governs the role 
played by S). This results could be important when describ- 
ing more complete non-local non-Guassian model, since it 
would indicate the need of including physical effects up to 
third order. 

4.2 Model selection 

We discuss which is the performance of the different model 
selection criteria presented in Subsection 13.21 For the par- 
ticular case of BE, we assume the natural prior p (e) found 
in the previous Subsection: e £ [—0.025,0.025]. As for the 
case of the parameter estimation previously discussed, we 
have considered a set of local non-Gaussian models given by 
different values of the non-linear parameter. 

We present the results, graphically, in figure [3] This 
plot consist in 6 rows and 5 columns. Each row corresponds 
to the results obtained for a given value of the e parameter 
(namely, from top to bottom: e = 0, 0.005, 0.010, 0.015, 0.020 
and 0.025). Each column refers to a model selection crite- 
rion (from left ro right: AIC, BIC, MDL, BE and GLRT). 
For the first column (i.e., the AIC case) we plot the dis- 
tributions (obtained after analysing 1,000 simulations) of a 
statistical variable defined as: Waic = R 2 /Q — 4/iV, no- 
tice (from equation I27|l that a positve value of Waic implies 
to accept Hi against Ho- The second column accounts for 
the distributions of the variable Wbic = R 2 /Q — 2 In N/N, 
which (from equation I29[l also satisfies to be positive when 
favoring Hi . Equivalently, third column shows the distribu- 
tions of Wmdl = R 2 /Q - 4/iV In NQ-s/Q/tt, that according 
to equation [31] is also positive when Hi is more likely than 
Ho. Fourth column provides In Bio, obtained from equa- 
tion 1371 Finally, in the fifth column we present the distri- 
butions obtained for v — (R 2 /iV) / (4/iV) which, according 
to the GLRT decision rule (equation 133 p provides a mea- 
surement of the strength in accepting Hi against Ho. 

In addition to the distributions, we also plot, as an in- 
dication, a vertical line in each panel. For Waic, Wbic and 
Wmdl, this vertical line separate the region where Ho is fa- 
vored (left side) from the one where Hi is more likely (right 
side). The percentage of the 1,000 simulations that fall on 
the Hi region is also reported at each panel. The vertical 
line for the Wbe statistic represents the limit for which the 
logarithmic Bayes' factor (In Bio) is greater than 1, which, 
in terms of the Jeffreys' rule, indicates that Hi is, at least, 
mildly favored against Ho. We also provide the percentage 
of the simulations satisfying this condition. Finally, the ver- 
tical line for the Wglrt statistic indicates the value of v 
for which 95% of the 1,000 simulations favor Hi instead of 
Ho- This value of v is also written in the panels. Notice 
that, among the asymptotic model selection criteria (AIC, 
BIC and MDL), AIC offers a less restrictive criterion than 
BIC, whereas BIC behaves similarly with respect to MDL, 



The figure also shows that BE provides a more conservative 
criterion than the asymptotic methods. 



5 APPLICATION TO WMAP 5- YEAR DATA 

We have applied the statistical approaches described in Sec- 
tion [3] to WMAP 5-year data. In particular, we have ana- 
lyzed a co-added CMB map generated from the global noise- 
weighted linear combination of the reduced foreground maps 
for the Ql, Q2, VI, V2, Wl, W 2, W3 and W4 difference as- 
semblies fsee lGold et al. 2008, for details). Weights are nor- 
malized to unity and, for each map, they are proportional to 
the inverse average noise variance across the sky. This op- 
eration is made at NSIDE=512 HEALPix resolution, being 
degraded afterwards down to NSIDE=32. 

Hence, we are in the same conditions as for the analysis 
on simulations described in the previous Section and, there- 
fore, the CMB cross-correlation in WMAP data is given by 
the £ correlation matrix already defined in Section[4] The es- 
timated full N-pdf of the WMAP 5-year data given the non- 
linear parameter e is showed in figure [4] (indeed, it is given in 
terms of the most common fsjL parameter for allowing a bet- 
ter comparison with previous works). Maximum-likelihood 
estimation (equation l20|l provides fNL = 30 with an error for 
the parameter (equation 12 1[) of <Tj nl = 62 (compatible with 
the values obtained from simulations). Hence our WMAP 
5-year data analysis reports: fNL = 30 ± 124 at 95% CL 
(or, equivalently, e = 0.019 ± 0.078 at 95%). This result 
is compatible with similar works in the literature reporting 
WMAP compatibility with Gaussian hypothesis. However, 
let us remark that this estimation is more efficient than pre- 
vious ones at similar angular res olution, since it pro vides 
a smaller error bar. For instance, ICurto et alj l|2007h per- 
formed a Gaussianity test on the WMAP data, at the same 
HEALPix resolution, although in a smaller region of the 
sky (16% instead of the 69% considered in this work). Their 
fNL estimator was based on the three Minkowski functionals, 
providing an error bar of « 200. The expected 0"{ nl provided 
by our maximum-likelihood estimator for a similar observed 
region would be a% ~ 124, i.e., ss 40% smaller than the 

° *NL ' ' 

one obtained by the Minkowski functionals. This result was 
expected since, as it was already discussed, the maximum- 
likelihood estimation of the non-linear parameter is optimal, 
given the local non-Gaussian model in equation [T] 

As it has been mentioned above, this result shows 
WMAP data compatibility with the Gaussian hypothesis, 
since fNL = can not be rejected at any significant confi- 
dence level. Of course, same conclusions are obtained from 
the model selection criteria described in Subsection l3.2l Nei- 
ther AIC, BIC, MDL nor BE criteria select Hi against Ho, 
whereas GLRT would favor Hi under the very weak condi- 
tion for v in equation 1331 of v w 0.1 (which implies a likeli- 
hood ratio of m 1.1) 

Finally, let us remark that, as it also happened for the 
simulations analyzed in Section [4] the contribution of the S 
term to the value of Q (see equation I17|l is not negligible, 
in particular, \S/(— 2 + J)| = 0.74. This indicates that, as it 
was already mentioned, the cubic term in the model given 
by equation [3] is not naturally negligible as compared to 
the quadratic term and, therefore, a should be chosen small 
enough (like the case a = considered in this work). 
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Figure 3. From left ro right, columns show the distribution for several statistics referred to different model selection criteria: AIC, B1C, 
MDL, BE and GLRT. From top to bottom, results for local non-Gaussian models for different non-linear parameters are give: e = 0, 
0.005, 0.010, 0.015, 0.020 and 0.025. Panels in columns 1, 2, 3 and 4 present a vertical line, separating the region where Hq and H\ are 
preferred (to the left and to the right of the vertical line, respectively). The vertical line on the last column represents the value for the 
v parameter for which 95% of the non-Gaussian simulations are more likely described by H\ rather than Hq. 
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Figure 4. This curve represent the probability given in equa- 
tion [12] i.e. the full pdf of the WMAP 5- year data x given the 
non-linear parameter e (or, equivalently, f^L,)- Vertical dotted line 
marks the maximum-likelihood estimation f^L = 30. 



6 CONCLUSIONS 

We have presented a parametric non-Gaussian model for the 
CMB temperature fluctuations. The non-Gaussian model 
is a local perturbation (up to third order) of the stan- 
dard CMB Gaussian field which recovers (for the case of 
b = in equation [TJ an approximative fo rm of the weak 
non-l i near coupling inflat ionary model (e.g. iKomatsu et al.l 
l200ll ; iLiguori et alJliool ) at scales larger than the horizon 
scale at the recombination time (i.e. above the degree scale). 
For this model, we are able to build the posterior prob- 
ability of the data given the non-linear parameter e (see 
equation from which, in principle, an optimal estimator 
(i.e., unbiased and with minimum variance) can be derived. 
Analytical expressions for the maximum-likelihood estima- 
tion of the non-linear parameter (e) and its associated error 
(erg) are derived. In addition, we also discuss an alterna- 
tive Bayesian estimation (in terms of the posterior prob- 
ability of the non-linear parameter), for the hypothetical 
case in which we might have some prior information for e. 
As an example, two cases are addressed: a non-informative 
(i.e, uniform) and a Gaussian priors. We also investigate 
an issue very much linked to the parameter estimation: the 
model selection. Indeed, we discuss several well known tech- 
niques to perform hy potheses test , like the Akaike infor- 
mation criterion ( AIC, [A kaike 1973), the Bayesian informa- 
tion criterion (BIC,[S chwarz 1978J), the minimum description 
length (MDL, Rissancn 2001), the generalized likelihood ra- 
tio test (GLRT) and the Bayesian evidence (BE). We derive 
analytical expressions, for the particular local non-Gaussian 
model proposed in this work, for all these model selection 
techniques. 

The performance of both, parameter estimators and 
model selection criteria, are investigated by analyzing non- 



Gaussian simulations, as they could be observed by WMAP. 
We check that the maximum-likelihood estimation provides 
an unbiased and efficient estimation of the non-linear pa- 
rameter defining the deviations from Gaussianity. We find 
that, for the HEALPix resolution considered in this work 
(NSIDE=32), results are consistent up to a value of e = 
0.025, which approximately corresponds (at the Sachs- Wolfe 
regime) to a value of the the non-linear coupling parameter 
fNL ~ 350. This parameter is the one commonly used to 
descr ibed the weak non-li near coupling inflationary model 
(e..g IKomatsu et al.l [200lh . We also find that, among the 
model selection criteria, AIC is the asymptotic method that 
provides the less restrictive decision rule, whereas, on the 
other hand, MDL is the most strict one. We also find that 
BE, for a uniform prior given by e £ [—0.025, 0.025], is even 
more restrictive than MDL. 

The proposed methodology is applied to WMAP 5-year 
data. We obtain a value for the non-linear coupling parame- 
ter of £nl = 30± 124 at 95% CL. This result provides a more 
efficient estimation than previous works in the literature, at 
the sam e angular scales. F or instance, comparing with the 
work bv lCurto et alj (120071 ) using Minkowski functionals, we 
can infer that the maximum-likelihood error bar is « 40% 
smaller than the one obtained with those geometrical esti- 
mators. Application of model selection criteria to WMAP 
data confirms that standard hypothesis of Gaussianity is fa- 
vored against the alternative hypothesis of non-Gaussianity, 
for the specific local model proposed in this work, and for 
the adopted resolution of w 2°. 

Finally, we would like to comment that, currently, we 
are extending the technique based on the N-pdf presented 
in this work, to deal with a more realistic non-local non- 
Gaussian model, where higher resolution CMB data are con- 
sidered, including as well the effect of anisotropic noise. 
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